HappyDB is a corpus of 100,000 crowd-sourced happy moments collected via Amazon’s Mechanical Turk. You can read more about it at https://arxiv.org/abs/1801.07746.
We first clean the text. This analysis mainly uses the cleaned text together with attributes of the respondents, such as age, country, gender, marital status, and parenthood status.
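The cleaning steps are not spelled out here, so the following is only a minimal sketch of what such a step could look like. It assumes that cleaning means lowercasing, removing punctuation and digits, and dropping stop words. The function name `clean_moment` and the short stop-word list are illustrative assumptions, not part of the original analysis.

```python
import re

# Illustrative stop-word list (an assumption; a real analysis would use a fuller one).
STOP_WORDS = {"a", "an", "and", "the", "i", "my", "me", "to", "of", "was",
              "is", "in", "on", "for", "with", "it", "that", "this", "at"}


def clean_moment(text):
    """Lowercase, keep letters only, and drop stop words from one happy moment."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # replace punctuation and digits with spaces
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


print(clean_moment("I watched a movie with my friend, and it was great!"))
# prints: watched movie friend great
```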
Let’s first look at Maslow’s Hierarchy of Needs theory.
This analysis first uses topic modeling, which automatically identifies the major themes in a corpus, usually by finding informative words. We can then see the major topics that make people happy. The topics appear to correspond to people’s needs.
The words mentioned by people can be grouped into two topics. The first topic (left) relates to self-actualization and esteem needs, and the second topic (right) relates to belongingness and love needs. In the first topic, people mention words such as day, time, watched, received, and job. The second topic reveals people’s need for love from friends and family, including husband and son.
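The topic-modeling algorithm is not named in this write-up. One common choice is latent Dirichlet allocation (LDA), so the sketch below fits a two-topic LDA model with collapsed Gibbs sampling on a toy corpus. The function names, the hyperparameters `alpha` and `beta`, and the toy documents are all assumptions made for illustration. A real analysis would more likely rely on an established library.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics=2, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on a list of tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * n_topics for _ in docs]         # topic counts per document
    topic_word = [Counter() for _ in range(n_topics)]  # word counts per topic
    topic_total = [0] * n_topics                       # token counts per topic
    assign = []                                        # current topic of each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_doc.append(z)
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1
        assign.append(z_doc)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token from the counts, then resample its topic.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1
    return topic_word


def top_words(topic_word, n=5):
    """Return up to n most frequent words of each topic, skipping zero counts."""
    return [[w for w, c in counts.most_common() if c > 0][:n] for counts in topic_word]


# Toy corpus of already-cleaned moments (invented for illustration).
docs = [
    "got new job today".split(),
    "received bonus job".split(),
    "dinner husband son".split(),
    "friend visited family dinner".split(),
]
topics = fit_lda(docs)
for k, words in enumerate(top_words(topics)):
    print(f"Topic {k + 1}: {', '.join(words)}")
```

With such a tiny corpus the topics are unstable and depend on the random seed. The point is only to show how each word occurrence is assigned to one of the two topics and how the top words per topic are read off afterwards.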
In this part, we explore how often these topics are mentioned by different groups of people, to learn what makes them happy.
From this graph, we can see that men are more eager to satisfy esteem and love needs.
After text cleaning, the analysis first explores which groups respond more to this topic. The graphs show the age distributions of different groups classified by gender, marital status, and parenthood status. These densities take into account the survey weights assigned to each observation.
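To make the idea of a weighted density concrete, here is a minimal sketch that builds a weighted histogram of ages and normalizes it so that the bar areas sum to 1. The function name, the bin width, and the sample ages and weights are assumptions for illustration. The actual plots may use a smoothed kernel density instead.

```python
def weighted_density(ages, weights, bin_width=10, start=0, stop=100):
    """Weighted histogram over [start, stop), normalized so bar areas sum to 1."""
    n_bins = (stop - start) // bin_width
    totals = [0.0] * n_bins
    for age, w in zip(ages, weights):
        if start <= age < stop:
            totals[int((age - start) // bin_width)] += w
    total_weight = sum(totals)
    if total_weight == 0:
        return totals  # nothing to normalize
    return [t / (total_weight * bin_width) for t in totals]


# Hypothetical ages and survey weights for five respondents.
ages = [22, 28, 35, 41, 67]
weights = [1.0, 2.0, 1.0, 0.5, 0.5]
density = weighted_density(ages, weights)
for i, value in enumerate(density):
    print(f"{i * 10:>2}-{i * 10 + 9}: {value:.3f}")
```

Each observation contributes its weight rather than a count of 1, so a respondent with weight 2.0 shifts the curve twice as much as one with weight 1.0.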
We notice that the trends for women and men are similar, although young men are happier than young women. The number of responses peaks among 28-year-olds. After age 50, women are happier than men.
As the age distributions suggest, single and married people respond more to this topic, while divorced, separated, and widowed people respond less. When people are young, being single seems to make them happier than being married. After age 30, married people seem happier.
It seems that people without children are happier when they are young, since they are not bothered by kids. After age 40, people with children seem to be happier, perhaps because their children can take care of them.
The result may be affected by how the data was collected. According to the graph, people in the US and India are happier than people living in other countries.
In this part, the whole dataset is divided by age, gender, marital status, and parenthood status, followed by WordCloud and word frequency graphs for each group. From the graphs, we can explore what contributes most to the happiness of different people.
When it comes to age, people are grouped by the following criteria:
1. Young: 1-30 years old
2. Middle: 30-60 years old
3. Old: 60-100 years old
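The grouping above can be sketched as a small function. The listed ranges share the boundary ages 30 and 60, and it is not stated which group they belong to, so this sketch assumes that 30 counts as Young and 60 counts as Middle. The function name `age_group` is also an assumption.

```python
def age_group(age):
    """Map an age in years to Young, Middle, or Old (None if out of range).

    Assumption: boundary ages go to the lower group (30 -> Young, 60 -> Middle).
    """
    if age < 1 or age > 100:
        return None
    if age <= 30:
        return "Young"
    if age <= 60:
        return "Middle"
    return "Old"


for age in (25, 30, 45, 60, 75):
    print(age, age_group(age))
```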
Let’s first see the overview of the whole dataset.
Overall, from the bar plot, we can see that people are most likely to talk about friend, day, and time. The next most frequent words are family, watched, home, and played. In the WordCloud, we can see other words mentioned by people, such as love, gifts, game, and family-related words (daughter, husband, son), which conforms to common sense.
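The word counts behind such a bar plot can be sketched with a simple counter over the cleaned moments. The function name and the four sample moments below are invented for illustration.

```python
from collections import Counter


def word_frequencies(cleaned_moments, top_n=10):
    """Count words across all cleaned moments and return the top_n most common."""
    counts = Counter()
    for moment in cleaned_moments:
        counts.update(moment.split())
    return counts.most_common(top_n)


# Hypothetical cleaned moments.
moments = [
    "friend visited today",
    "played game friend",
    "family dinner friend",
    "family day home",
]
print(word_frequencies(moments, top_n=2))
# prints: [('friend', 3), ('family', 2)]
```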
With the same method, we can produce WordCloud and word frequency plots for specific groups. The result for young people is similar to the overall result above. In a slight departure from the whole dataset, middle-aged people are more likely to mention daughter, son, and night. Older people rely more on their spouses, since they are more likely to mention words such as wife and husband.
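Applying the same counting to each group separately could look like the sketch below. The record layout (a `text` field plus one field per attribute) and the sample records are assumptions, not the actual HappyDB column names.

```python
from collections import Counter, defaultdict


def top_words_by_group(records, group_key, top_n=3):
    """Count words separately for each value of group_key and keep the top_n."""
    counts = defaultdict(Counter)
    for rec in records:
        counts[rec[group_key]].update(rec["text"].split())
    return {group: [w for w, _ in c.most_common(top_n)] for group, c in counts.items()}


# Hypothetical cleaned records with one demographic attribute.
records = [
    {"gender": "f", "text": "husband dinner home"},
    {"gender": "f", "text": "husband daughter"},
    {"gender": "m", "text": "game friend"},
    {"gender": "m", "text": "game wife"},
]
result = top_words_by_group(records, "gender", top_n=1)
print(result)
# prints: {'f': ['husband'], 'm': ['game']}
```

The same function works for any attribute: pass a different `group_key`, such as a marital status or age group field, to compare other groups.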
Women are more likely to talk about family-related words such as husband, son, family, daughter, and home. Men mention game in addition to family-related words.
Married people are likely to talk about family, husband, wife, daughter, and son. The spouse-related words indeed disappear from the words mentioned by divorced people. They care more about their children and are more willing to mention other words such as money and boyfriend.
The difference between people who are parents and those who are not is very apparent. Parents care more about their family, while words about children do not appear in the plot for people who are not parents.